Resonant anomaly detection with multiple reference datasets
نویسندگان
چکیده
A bstract An important class of techniques for resonant anomaly detection in high energy physics builds models that can distinguish between reference and target datasets, where only the latter has appreciable signal. Such techniques, including Classification Without Labels (CWoLa) Simulation Assisted Likelihood-free Anomaly Detection (SALAD) rely on a single dataset. They cannot take advantage commonly-available multiple datasets thus fully exploit available information. In this work, we propose generalizations CWoLa SALAD settings are available, building weak supervision techniques. We demonstrate improved performance number with realistic synthetic data. As an added benefit, our enable us to provide finite-sample guarantees, improving existing asymptotic analyses.
منابع مشابه
Compressed Anomaly Detection with Multiple Mixed Observations
We consider a collection of independent random variables that are identically distributed, except for a small subset which follows a different, anomalous distribution. We study the problem of detecting which random variables in the collection are governed by the anomalous distribution. Recent work proposes to solve this problem by conducting hypothesis tests based on mixed observations (e.g. li...
متن کاملAnomaly Detection in Categorical Datasets Using Bayesian Networks
In this paper we present a method for finding anomalous records in categorical or mixed datasets in an unsupervised fashion. Since the data in many problems consist of normal records with a small minority of anomalies, many approaches build a model from the training data and compare the test records against it. But instead of building a model, we keep track of the number of occurrences of diffe...
متن کاملStatistical Anomaly Detection Technique for Real Time Datasets
Data mining is the technique of discovering patterns among data to analyze patterns or decision making predictions. Anomaly detection is the technique of identifying occurrences that deviate immensely from the large amount of data samples. Advances in computing generates large amount of data from different sources, which is very difficult to apply machine learning techniques due to existence of...
متن کاملAn integrated pan-tropical biomass map using multiple reference datasets.
We combined two existing datasets of vegetation aboveground biomass (AGB) (Proceedings of the National Academy of Sciences of the United States of America, 108, 2011, 9899; Nature Climate Change, 2, 2012, 182) into a pan-tropical AGB map at 1-km resolution using an independent reference dataset of field observations and locally calibrated high-resolution biomass maps, harmonized and upscaled to...
متن کامل3D Gabor Based Hyperspectral Anomaly Detection
Hyperspectral anomaly detection is one of the main challenging topics in both military and civilian fields. The spectral information contained in a hyperspectral cube provides a high ability for anomaly detection. In addition, the costly spatial information of adjacent pixels such as texture can also improve the discrimination between anomalous targets and background. Most studies miss the wort...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Journal of High Energy Physics
سال: 2023
ISSN: ['1127-2236', '1126-6708', '1029-8479']
DOI: https://doi.org/10.1007/jhep07(2023)188